Building Hybrid Knowledge Representations from Text
Abstract
A significant obstacle to the development of intelligent natural language processing systems is the lack of rich knowledge bases containing representations of world knowledge. For experimental systems it is common practice to construct small knowledge bases by hand; however, this approach does not scale well to large systems. An alternative is to attempt to extract the desired information from existing knowledge sources intended for human consumption; however, attempts to construct broad-coverage knowledge bases using in-depth analysis have met with limited success. In this paper we present some work on an alternative approach that involves using shallow processing techniques to build a hybrid knowledge representation that stores information in a partially analysed form.
Similar resources
Knowledge Enhanced Hybrid Neural Network for Text Matching
Long text brings a big challenge to semantic matching due to their complicated semantic and syntactic structures. To tackle the challenge, we consider using prior knowledge to help identify useful information and filter out noise to matching in long text. To this end, we propose a knowledge enhanced hybrid neural network (KEHNN). The model fuses prior knowledge into word representations by know...
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays the key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researches to use...
Recherche d'information précise dans des sources d'information structurées et non structurées : défis, approches et hybridation
This paper provides a synthesis of text-based Question Answering (QA) approaches, with emphasis on models exploiting structured representations of texts, and recent QA approaches for knowledge bases. Our goal is to show the common issues, and to identify what can be a common approach, using both the relations from textual documents and the triplets from knowledge bases. We present a few works u...
The Effect of Feature Representation on MEDLINE Document Classification
This work explores the effect of text representation techniques on the overall performance of medical text classification. To accomplish this goal, we developed a text classification system that supports the very basic word representation (bag-of-words) and the more complex medical phrase representation (bag-of-phrases). We also combined word and phrase representations (hybrid) for further anal...
Inducing Event Schemas and Their Participants from Unlabeled Text: A Dissertation Submitted to the Department of Computer Science and the Committee on Graduate Studies of Stanford University in Partial Fulfillment of the Requirements for the Degree of Doctor of Philosophy
The majority of information on the Internet is expressed in written text. Understanding and extracting this information is crucial to building intelligent systems that can organize this knowledge, but most algorithms focus on learning atomic facts and relations. For instance, we can reliably extract facts like “Stanford is a University” and “Professors teach Science” by observing redundant word...
Publication date: 2000